NSF PAR Search | NSF Public Access Repository

gsplat: An Open-Source Library for Gaussian Splatting

Ye, Vickie; Li, Ruilong; Kerr, Justin; Turkulainen, Matias; Yi, Brent; Pan, Zhuoyang; Seiskari, Otto; Ye, Jianbo; Hu, Jeffrey; Tancik, Matthew; et al. (February 2025, Journal of Machine Learning Research)

gsplat is an open-source library designed for training and developing Gaussian Splatting methods. It features a front-end with Python bindings compatible with the PyTorch library and a back-end with highly optimized CUDA kernels. gsplat offers numerous features that enhance the optimization of Gaussian Splatting models, which include optimization improvements for speed, memory, and convergence times. Experimental results demonstrate that gsplat achieves up to 10% less training time and 4× less memory than the original Kerbl et al. (2023) implementation. Utilized in several research projects, gsplat is actively maintained on GitHub. Source code is available at https://github.com/nerfstudio-project/gsplat under Apache License 2.0. We welcome contributions from the open-source community.
Free, publicly-accessible full text available February 1, 2026
Robot See Robot Do: Imitating Articulated Object Manipulation with Monocular 4D Reconstruction

Kerr, Justin; Kim, Chung Min; Wu, Mingxuan; Yi, Brent; Wang, Qianqian; Goldberg, Ken; Kanazawa, Angjoo (November 2024, Conference on Robot Learning)

Humans can learn to manipulate new objects by simply watching others; providing robots with the ability to learn from such demonstrations would enable a natural interface for specifying new behaviors. This work develops Robot See Robot Do (RSRD), a method for imitating articulated object manipulation from a single monocular RGB human demonstration given a single static multi-view object scan. We first propose 4D Differentiable Part Models (4D-DPM), a method for recovering 3D part motion from a monocular video with differentiable rendering. This analysis-by-synthesis approach uses part-centric feature fields in an iterative optimization, which enables the use of geometric regularizers to recover 3D motions from only a single video. Given this 4D reconstruction, the robot replicates object trajectories by planning bimanual arm motions that induce the demonstrated object part motion. By representing demonstrations as part-centric trajectories, RSRD focuses on replicating the demonstration's intended behavior while considering the robot's own morphological limits, rather than attempting to reproduce the hand's motion. We evaluate 4D-DPM's 3D tracking accuracy on ground truth annotated 3D part trajectories and RSRD's physical execution performance on 9 objects across 10 trials each on a bimanual YuMi robot. Each phase of RSRD achieves an average success rate of 87%, for a total end-to-end success rate of 60% across 90 trials. Notably, this is accomplished using only feature fields distilled from large pretrained vision models -- without any task-specific training, fine-tuning, dataset collection, or annotation.
Full Text Available
GARField: Group Anything with Radiance Fields

Kim, Chung Min; Wu, Mingxuan; Kerr, Justin; Goldberg, Ken; Tancik, Matthew; Kanazawa, Angjoo (June 2024, CVPR)

Grouping is inherently ambiguous due to the multiple levels of granularity in which one can decompose a scene -- should the wheels of an excavator be considered separate or part of the whole? We present Group Anything with Radiance Fields (GARField), an approach for decomposing 3D scenes into a hierarchy of semantically meaningful groups from posed image inputs. To do this we embrace group ambiguity through physical scale: by optimizing a scale-conditioned 3D affinity feature field, a point in the world can belong to different groups of different sizes. We optimize this field from a set of 2D masks provided by Segment Anything (SAM) in a way that respects coarse-to-fine hierarchy, using scale to consistently fuse conflicting masks from different viewpoints. From this field we can derive a hierarchy of possible groupings via automatic tree construction or user interaction. We evaluate GARField on a variety of in-the-wild scenes and find it effectively extracts groups at many levels: clusters of objects, objects, and various subparts. GARField inherently represents multi-view consistent groupings and produces higher fidelity groups than the input SAM masks. GARField's hierarchical grouping could have exciting downstream applications such as 3D asset extraction or dynamic scene understanding. See the project website at https://www.garfield.studio/
Full Text Available
All You Need is LUV: Unsupervised Collection of Labeled Images Using UV-Fluorescent Markings

https://doi.org/10.1109/IROS47612.2022.9981768

Thananjeyan, Brijen; Kerr, Justin; Huang, Huang; Gonzalez, Joseph E.; Goldberg, Ken (October 2022, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS))

Full Text Available
PRIMAL: Pathfinding via Reinforcement and Imitation Multi-Agent Learning

https://doi.org/10.1109/LRA.2019.2903261

Sartoretti, Guillaume; Kerr, Justin; Shi, Yunfei; Wagner, Glenn; Kumar, T. K.; Koenig, Sven; Choset, Howie (July 2019, IEEE Robotics and Automation Letters)

Full Text Available